Corpus and Voices for Catalan Speech Synthesis
Abstract
In this paper we describe the design and production of a Catalan database for building synthetic voices. Two speakers each recorded 10 hours of speech. The speaker selection and the corpus design aim to provide resources for high-quality synthesis. As a side effect of the speaker selection process, we also produced 10 databases of 1 hour each, which allow medium-quality speech synthesis. The resources have been used to build voices for the Festival TTS. Both the original recordings and the Festival databases are freely available for research and for commercial use.
Similar resources
Recent work on the FESTCAT database for speech synthesis
This paper presents our work around the FESTCAT project, whose main goal was the development of voices for the Festival suite in Catalan. In the first year, we produced the corpus and the speech data needed to build 10 voices using the Clunits (unit selection) and the HTS (Markov models) methods. The resulting voices are freely available on the web page of the project and included in Linkat, a...
Synthesis using Speaker Adaptation from Speech Recognition DB
This paper deals with the creation of multiple voices from a Hidden Markov Model based speech synthesis system (HTS). More than 150 Catalan synthetic voices were built using Hidden Markov Models (HMM) and speaker adaptation techniques. Training data for building a Speaker-Independent (SI) model were selected from both a general purpose speech synthesis database (FestCat) and a database designe...
Synthesis and evaluation of conversational characteristics in speech synthesis
Conventional synthetic voices can synthesise neutral read-aloud speech well. But to make synthetic speech more suitable for a wider range of applications, the voices need to express more than just the word identity. We need to develop voices that can partake in a conversation and express, e.g., agreement, disagreement, or hesitation in a natural and believable manner. In speech synthesis there ar...
Intelligibility analysis of fast synthesized speech
In this paper we analyse the effect of speech corpus and compression method on the intelligibility of synthesized speech at fast rates. We recorded English and German language voice talents at a normal and a fast speaking rate and trained an HSMM-based synthesis system based on the normal and the fast data of each speaker. We compared three compression methods: scaling the variance of the state ...
Frequency analysis of phonetic units for concatenative synthesis in Catalan
Knowledge of phonetic unit frequency is essential for developing databases in both concatenative synthesis and continuous speech recognition. In the present work, a large corpus of text was processed and phonetically transcribed to obtain allophone and diphone frequencies for the Catalan language. The corpus was acquired from newspaper articles, in which there were a lot of foreign words t...